
Conversation

@checkupup
Contributor

At present, the dax instance is initialized and its inner buffer allocated on module_init(), and released on module_free(). The memory requirement for the inner buffer per instance is quite large.

The existing implementation has 2 pipelines containing dax. Even though the host is restricted from processing a stream on both pipelines simultaneously, they may coexist under some circumstances, e.g. the interim when switching the PCM device from one to the other, which may exhaust memory.

This commit changes the timing of instance allocation/deallocation to module_prepare()/reset() respectively. That is, the dax instance only occupies the inner buffer memory while processing.
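
A minimal sketch of the lifecycle described above, using the SOF module adapter callbacks. create_instance() is assumed here as the counterpart of the destroy_instance() helper quoted later in this review, the prepare() signature follows the current module adapter interface as I understand it, and error handling is trimmed:

#include <sof/audio/module_adapter/module/generic.h>

/* assumed helpers that allocate/free the large DAX inner buffer */
int create_instance(struct processing_module *mod);
void destroy_instance(struct processing_module *mod);

static int sof_dax_prepare(struct processing_module *mod,
			   struct sof_source **sources, int num_of_sources,
			   struct sof_sink **sinks, int num_of_sinks)
{
	/* the inner buffer is only needed while a stream is active */
	return create_instance(mod);
}

static int sof_dax_reset(struct processing_module *mod)
{
	/* give the memory back as soon as the stream is torn down */
	destroy_instance(mod);
	return 0;
}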

@checkupup
Contributor Author

PR that references #10503

@sofci
Collaborator

sofci commented Feb 6, 2026

Can one of the admins verify this patch?

Reply "test this please" to run this test once.

Contributor

@johnylin76 left a comment


LGTM overall. Please help test this.

struct sof_dax *dax_ctx = module_get_private_data(mod);

/* dax instance will be established on prepare(), and destroyed on reset() */
destroy_instance(mod);
Contributor


Maybe also add an if (dax_ctx) check to protect these lines,

although I'm not sure whether this path can be reached. We can always be safe.
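
A minimal sketch of the suggested guard, reusing the reset handler lines from the hunk above; only the if (dax_ctx) check is new:

struct sof_dax *dax_ctx = module_get_private_data(mod);

/* dax instance will be established on prepare(), and destroyed on reset() */
if (dax_ctx)
	destroy_instance(mod);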

@lgirdwood
Member

@checkupup @johnylin76 I think we need to consider that DSPs are quite memory-constrained devices and the asynchronous nature of PCM/pipeline creation/destruction within ALSA can easily leave us with some heap fragmentation, especially when we need to allocate very large contiguous buffers (we had this problem with the key phrase buffer).

e.g. please consider

  1. User plays audio to endpoint 1 with DAX instance 1
  2. User selects endpoint 2 with DAX instance 2; DAX 1 then frees memory.
  3. An asynchronous audio event occurs before the DAX 2 PCM is ready, e.g. an incoming call or system notification, and allocates memory resources for its pipeline.
  4. DAX instance 2 attempts to allocate a large contiguous buffer and fails because event 3 fragmented the heap.

If we can allocate the DAX memory resources at init() (by the 1st DAX instance) and then pass/share the resources between DAX instances, then we don't have to worry about heap fragmentation and other audio events.

@checkupup
Contributor Author

@checkupup @johnylin76 I think we need to consider that DSPs are quite memory-constrained devices and the asynchronous nature of PCM/pipeline creation/destruction within ALSA can easily leave us with some heap fragmentation, especially when we need to allocate very large contiguous buffers (we had this problem with the key phrase buffer).

e.g. please consider

  1. User plays audio to endpoint 1 with DAX instance 1
  2. User selects endpoint 2 with DAX instance 2; DAX 1 then frees memory.
  3. An asynchronous audio event occurs before the DAX 2 PCM is ready, e.g. an incoming call or system notification, and allocates memory resources for its pipeline.
  4. DAX instance 2 attempts to allocate a large contiguous buffer and fails because event 3 fragmented the heap.

If we can allocate the DAX memory resources at init() (by the 1st DAX instance) and then pass/share the resources between DAX instances, then we don't have to worry about heap fragmentation and other audio events.

Hi @johnylin76 @lgirdwood. If so, I would suggest using the BSS section for the dax memories, similar to the implementation in src/audio/google/google_rtc_audio_processing.c:

static __aligned(PLATFORM_DCACHE_ALIGN)
uint8_t aec_mem_blob[CONFIG_COMP_GOOGLE_RTC_AUDIO_PROCESSING_MEMORY_BUFFER_SIZE_KB * 1024];

This:

  1. Avoids memory fragmentation
  2. Avoids exhausting memory when switching the PCM device from one to the other; see dax: set instance lifespan from prepare to reset #10503

However, there are some known issues:

  1. How do we ensure that multiple DAX wrapper modules can exist at the same time while only one DAX instance processes the stream? The 'set instance lifespan from prepare to reset' scheme also needs this criterion. The difference is that the former causes undefined behavior while the latter reports a memory allocation failure.
  2. BSS consumes real physical pages whereas virtual memory does not. I'm afraid this will consume more actual physical memory.

@johnylin76
Contributor

If we can allocate the DAX memory resources at init() (by the 1st DAX instance) and then pass/share the resources between DAX instances, then we don't have to worry about heap fragmentation and other audio events.

Hi @lgirdwood,

Could you elaborate on this? I don't think I have seen an example of a module/codec that shares resources between instances. Thanks.

@johnylin76
Contributor

Hi @johnylin76 @lgirdwood. If so, I would suggest using the BSS section for the dax memories, similar to the implementation in src/audio/google/google_rtc_audio_processing.c:

static __aligned(PLATFORM_DCACHE_ALIGN)
uint8_t aec_mem_blob[CONFIG_COMP_GOOGLE_RTC_AUDIO_PROCESSING_MEMORY_BUFFER_SIZE_KB * 1024];

I'm not sure BSS is a suitable solution; I'm afraid it may introduce other issues. For example, the memory may be directly reflected in the built image (*.ri) size and slow down the DSP load time.

How do we ensure that multiple DAX wrapper modules can exist at the same time while only one DAX instance processes the stream? The 'set instance lifespan from prepare to reset' scheme also needs this criterion. The difference is that the former causes undefined behavior while the latter reports a memory allocation failure.

That is indeed tricky. On the SOF firmware side there is actually no such constraint. What we do so far is enforce mutually exclusive usage of the ALSA PCM devices from the system side.

@johnylin76
Contributor

Hi @checkupup @lgirdwood

Since we still have many ideas for optimizing the design, can we settle on the current implementation first? That way we can have a workable DAX-supported firmware solution shortly to unblock the ongoing project, while keeping the optimization discussion in a newly created issue tracker.

Thanks.

@lgirdwood
Member

@checkupup @johnylin76 I think we need to consider that DSPs are quite memory-constrained devices and the asynchronous nature of PCM/pipeline creation/destruction within ALSA can easily leave us with some heap fragmentation, especially when we need to allocate very large contiguous buffers (we had this problem with the key phrase buffer).
e.g. please consider

  1. User plays audio to endpoint 1 with DAX instance 1
  2. User selects endpoint 2 with DAX instance 2; DAX 1 then frees memory.
  3. An asynchronous audio event occurs before the DAX 2 PCM is ready, e.g. an incoming call or system notification, and allocates memory resources for its pipeline.
  4. DAX instance 2 attempts to allocate a large contiguous buffer and fails because event 3 fragmented the heap.

If we can allocate the DAX memory resources at init() (by the 1st DAX instance) and then pass/share the resources between DAX instances, then we don't have to worry about heap fragmentation and other audio events.

Hi @johnylin76 @lgirdwood. If so, I would suggest using the BSS section for the dax memories, similar to the implementation in src/audio/google/google_rtc_audio_processing.c:

static __aligned(PLATFORM_DCACHE_ALIGN)
uint8_t aec_mem_blob[CONFIG_COMP_GOOGLE_RTC_AUDIO_PROCESSING_MEMORY_BUFFER_SIZE_KB * 1024];

This:

  1. Avoids memory fragmentation
  2. Avoids exhausting memory when switching the PCM device from one to the other; see dax: set instance lifespan from prepare to reset #10503

However, there are some known issues:

  1. How do we ensure that multiple DAX wrapper modules can exist at the same time while only one DAX instance processes the stream? The 'set instance lifespan from prepare to reset' scheme also needs this criterion. The difference is that the former causes undefined behavior while the latter reports a memory allocation failure.

The expectation is that a module ELF section should behave the same whether built-in or as an llext module, i.e. if the above BSS solution causes an error we should log/fix it.

  1. BSS consumes real physical pages whereas virtual memory does not. I'm afraid this will consume more actual physical memory.

On Intel HW both solutions consume PHY pages, so it's not a loss here.

@lgirdwood
Member

If we can allocate the DAX memory resources at init() (by the 1st DAX instance) and then pass/share the resources between DAX instances, then we don't have to worry about heap fragmentation and other audio events.

Hi @lgirdwood,

Could you elaborate on this? I don't think I have seen an example of a module/codec that shares resources between instances. Thanks.

I was thinking here we could have a static context pointer in BSS: if NULL, DAX would allocate it, and if not NULL, DAX would use it during init(). That would give us a shared resource with mutual exclusion if we also had an atomic_t int around it for access.

I agree with the other comments; this is quite a corner case, given we have a high memory requirement on a device with limited memory.
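
A hedged sketch of that shared-context idea, assuming Zephyr's atomic API from <zephyr/sys/atomic.h>; the names dax_shared_ctx, dax_shared_busy and dax_alloc_inner_buffer() are hypothetical placeholders, not the actual DAX code:

#include <zephyr/sys/atomic.h>
#include <stddef.h>

/* placeholder for whatever allocator the module ends up using */
void *dax_alloc_inner_buffer(void);

static void *dax_shared_ctx;		/* shared inner buffer, pointer lives in BSS */
static atomic_t dax_shared_busy;	/* set while an instance owns the buffer */

static void *dax_claim_shared_ctx(void)
{
	/* mutual exclusion: only one instance may use the buffer at a time */
	if (!atomic_cas(&dax_shared_busy, 0, 1))
		return NULL;

	/* first claimer allocates; later claimers reuse the same memory */
	if (!dax_shared_ctx)
		dax_shared_ctx = dax_alloc_inner_buffer();

	return dax_shared_ctx;
}

static void dax_release_shared_ctx(void)
{
	atomic_set(&dax_shared_busy, 0);
}

The buffer is deliberately never freed here, so later instances reuse the same allocation instead of re-allocating into a possibly fragmented heap.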

@lgirdwood
Member

Hi @checkupup @lgirdwood

Since we still have many ideas for optimizing the design, can we settle on the current implementation first? That way we can have a workable DAX-supported firmware solution shortly to unblock the ongoing project, while keeping the optimization discussion in a newly created issue tracker.

Thanks.

Ack - let's unblock the project and use the existing PR. FWIW we have a lot of memory updates going in around userspace LL, and this is a good example case that we want to support.

Member

@lgirdwood left a comment


LGTM, just some open items from @johnylin76

@checkupup
Contributor Author

Hi @johnylin76 @lgirdwood, I have verified this change and found no issues. An additional mechanism was added to delay the reset and free, because I found that when sof_dax_reset executes, dax_process might still be processing. Please help review it, thanks.

@johnylin76
Contributor

Hi @johnylin76 @lgirdwood, I have verified this change and found no issues. An additional mechanism was added to delay the reset and free, because I found that when sof_dax_reset executes, dax_process might still be processing. Please help review it, thanks.

That is what I overlooked, and it is definitely bad. For LL modules that could never happen, but since DAX runs as a DP component, the process function is wrapped by ring buffers (LL) and runs on another thread. As a result, sof_dax_process and sof_dax_reset run asynchronously.

Your latest change looks logically correct, but I am worried about the race condition. Perhaps the process/reset signal could be implemented with an atomic_t variable in dax_ctx?

@checkupup
Contributor Author

checkupup commented Feb 10, 2026

Well, I guess sof_dax_set_configuration and sof_dax_process also run asynchronously. If so, I could submit another PR to fix the synchronization problem using a spinlock or atomic_t. What do you think? @johnylin76 @lgirdwood

return -EINVAL;
}

dax_ctx->update_flags |= DAX_PROCESSING_MASK;
Contributor


I would suggest a new atomic variable in dax_ctx, e.g. atomic_t proc_flags.

#include <rtos/atomic.h>

static void set_dax_proc_flag(..., int32_t flag)
{
  int32_t proc_flags = atomic_read(&dax_ctx->proc_flags);
  atomic_set(&dax_ctx->proc_flags, proc_flags | flag);
}
static bool get_dax_proc_flag(..., int32_t flag)
{
  int32_t proc_flags = atomic_read(&dax_ctx->proc_flags);
  return !!(proc_flags & flag);
}

Contributor


This is so far the best approach in my mind, although it is not ideal. (Ideally the lock would be held by process for the whole function.)

But it should be fair enough, given that process calls take much longer than reset.

Member


ack - DP/LL are different threads. The DP thread is always preempted by the synchronous LL thread every 1 ms, so this solution looks fine for protecting shared module resources.

Contributor


The DP thread is always preempted by the synchronous LL thread every 1 ms.

Oh, that is critical. So basically we can assume the variable accesses on the two threads won't conflict. Then I assume the lock/atomic can be deferred.

Member


I would still add the atomic flag, as we could race when tearing down the old DAX DP thread whilst creating the new DAX DP thread during the transition. LL would interrupt this DP transition too if other audio is being processed, and we also have Linux userspace opening/closing ALSA PCMs (which will be subject to user scheduling).

Contributor Author


I agree. The probability is extremely small because the DP thread is always preempted by the synchronous LL thread every 1 ms, but that doesn't mean it's absolutely impossible. I will submit another PR to protect dax_ctx->update_flags.

source_release_data(source, consumed_bytes);

check_and_update_settings(mod);
if (dax_ctx->enable && !dax_ctx->p_dax) {
Contributor


Would this case happen?
Is dax_init a non-blocking function?

Contributor Author


sequenceDiagram
    participant DP as DP Thread
    participant LL as LL Thread

    DP->>DP: Running sof_dax_process
    activate DP
    LL->>LL: Starts sof_dax_reset
    activate LL
    LL->>LL: sof_dax_reset is done
    deactivate LL
    DP->>DP: Calls check_and_update_state
    activate DP
    DP->>DP: check_and_update_state returns
    deactivate DP
    DP->>DP: sof_dax_process is done
    deactivate DP
    DP->>DP: sof_dax_process starts again
    activate DP

See the diagram above; I'm worried that sof_dax_process will run another iteration before being stopped.

Contributor


I think that should be unreachable.

During reset, module_adapter takes care of schedule_task_cancel https://github.com/thesofproject/sof/blob/main/src/audio/module_adapter/module/generic.c#L642
which dispatches to zephyr_dp_schedule in our case. The task is removed from the schedule list once reset is done, which means sof_dax_process won't be scheduled after that.

Having said that, it does no harm to add the safety check. So LGTM.

Contributor Author


I will remove this check.

@johnylin76
Contributor

Well, I guess sof_dax_set_configuration and sof_dax_process also run asynchronously. If so, I could submit another PR to fix the synchronization problem using a spinlock or atomic_t. What do you think? @johnylin76 @lgirdwood

Yes. Given that the module runs in DP mode, once the process task is scheduled we should not modify from LL a bit that will be read/written during process. So yes, the potential problem is bigger than we imagined.

In most cases we are fine, except that with the new change we release the data buffer on reset. In module_adapter_reset, the DP scheduler is cancelled while the running event (if there is one) is allowed to finish. That is, reset is not blocked on process, which may result in the buffer being released while still under processing.

I do agree the implementation with lock/atomic can be deferred as long as the buffer release issue above is taken care of.
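
A hedged sketch of the "delay reset and free" idea discussed here: reset only marks the instance for release if the DP process is still in flight, and process performs the actual free when it returns. DAX_RELEASE_PENDING_MASK is an illustrative flag name, and these plain flag accesses would themselves still need the atomic/lock treatment discussed below:

static int sof_dax_reset(struct processing_module *mod)
{
	struct sof_dax *dax_ctx = module_get_private_data(mod);

	if (dax_ctx->update_flags & DAX_PROCESSING_MASK) {
		/* DP thread is still inside process(): defer the free */
		dax_ctx->update_flags |= DAX_RELEASE_PENDING_MASK;
		return 0;
	}

	destroy_instance(mod);
	return 0;
}

/* at the tail of sof_dax_process() */
	dax_ctx->update_flags &= ~DAX_PROCESSING_MASK;
	if (dax_ctx->update_flags & DAX_RELEASE_PENDING_MASK)
		destroy_instance(mod);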


At present, the dax instance is initialized and its inner buffer
allocated on module_init(), and released on module_free(). The
memory requirement for the inner buffer per instance is quite large.

The existing implementation has 2 pipelines containing dax. Even
though the host is restricted from processing a stream on both
pipelines simultaneously, they may coexist under some circumstances,
e.g. the interim when switching the PCM device from one to the
other, which may exhaust memory.

This commit changes the timing of instance allocation/deallocation to
module_prepare()/reset() respectively. That is, the dax instance only
occupies the inner buffer memory while processing.

Reported-by: Johny Lin <johnylin@google.com>
Signed-off-by: Jun Lai <jun.lai@dolby.com>
In the DP domain, sof_dax_process runs in a DP thread,
while the other sof_dax_* functions run in a different
LL thread. Use atomic flags instead of the original
dax_ctx->update_flags to avoid potential race
conditions when updating the flags.

Since dax_inf.h should not have any dependencies
on SOF, the atomic flags are defined in dax.c

Signed-off-by: Jun Lai <jun.lai@dolby.com>
@checkupup
Contributor Author

Hi @johnylin76 @lgirdwood, I appended another commit to this PR that implements the atomic flags; please help review, thanks.

return atomic_read(flags) & flag;
case DAX_FLAG_ADD:
if ((atomic_read(flags) & flag) == 0)
atomic_add(flags, flag);
Collaborator


I don't think this provides the synchronisation that you're trying to achieve here. The thread can be preempted between the two lines above. In general it's very difficult to implement synchronisation with atomic variables alone; I think you really need a proper lock if you really need to protect against this race. Also note that yesterday a PR got merged that moves all DP processing in SOF to userspace. With that, your DAX DP thread will run in userspace too, and your .reset() method will now be called in the same thread context, synchronously with processing. So, as of today I don't think this race can appear, but please re-test.
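
To illustrate the preemption window: a hedged sketch assuming Zephyr's atomic API from <zephyr/sys/atomic.h>. A single read-modify-write primitive such as atomic_or() removes the gap between the test and the update, although, as noted above, atomics alone still don't give full synchronisation between process and reset:

#include <zephyr/sys/atomic.h>
#include <stdbool.h>

static void dax_flag_add(atomic_t *flags, atomic_val_t flag)
{
	/* atomically ORs the bit in; idempotent, so no prior test is needed */
	atomic_or(flags, flag);
}

static void dax_flag_clear(atomic_t *flags, atomic_val_t flag)
{
	atomic_and(flags, ~flag);
}

static bool dax_flag_test(atomic_t *flags, atomic_val_t flag)
{
	return (atomic_get(flags) & flag) != 0;
}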
